In [1]:
import pandas as pd
from bokeh.charts import Line
from bokeh.models import GlyphRenderer, HoverTool, Range1d, LinearAxis
from bokeh.palettes import Spectral4
from bokeh.io import output_notebook, show
output_notebook()


BokehJS successfully loaded.

What's the latest on worldmapper's science research map?

This was inspired by the article "This map of the world’s scientific research is disturbingly unequal" (http://qz.com/449405/this-map-of-the-worlds-scientific-research-is-disturbingly-unequal/), which is an interesting read and points out several problems with the original metric. I thought I'd see how the numbers have changed since the map was made, even if the metric is flawed.

http://www.worldmapper.org/display.php?selected=205

Scientific papers cover physics, biology, chemistry, mathematics, clinical medicine, biomedical research, engineering, technology, and earth and space sciences.

The number of scientific papers published by researchers in the United States was more than three times as many as were published by the second highest-publishing population, Japan.

There is more scientific research, or publication of results, in richer territories. This locational bias is such that roughly three times more scientific papers per person living there are published in Western Europe, North America, and Japan, than in any other region.


Territory size shows the proportion of all scientific papers published in 2001 written by authors living there.


In [2]:
# The original data source was the World Bank in 2005.
# More recent data retrieved from:
# http://databank.worldbank.org/data/reports.aspx?source=world-development-indicators#
# aggregated by country income level

data = pd.read_excel('Data_Extract_From_World_Development_Indicators.xlsx', skiprows=[6,7,8,9,10])
data


Out[2]:
Series Name Series Code Country Name Country Code 2005 [YR2005] 2006 [YR2006] 2007 [YR2007] 2008 [YR2008] 2009 [YR2009] 2010 [YR2010] 2011 [YR2011] 2012 [YR2012] 2013 [YR2013] 2014 [YR2014]
0 Scientific and technical journal articles IP.JRN.ARTC.SC United States USA 205564.6 209272.3 209898.0 212883.0 208600.8 .. .. .. .. ..
1 Scientific and technical journal articles IP.JRN.ARTC.SC Upper middle income UMC 75308.2 86889.8 98451.9 109621.2 119248.5 122109.6 135289.7 .. .. ..
2 Scientific and technical journal articles IP.JRN.ARTC.SC Lower middle income LMC 22131.5 24360.2 26136.2 27411.3 28479.7 29832.2 31918.9 .. .. ..
3 Scientific and technical journal articles IP.JRN.ARTC.SC High income HIC 611257.8 627881.9 633008.4 645394.4 639843.8 403896.7 413798.7 .. .. ..
4 Scientific and technical journal articles IP.JRN.ARTC.SC Low income LIC 738.0 875.2 978.9 894.6 973.3 977.5 1004.3 .. .. ..
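The `..` entries in the extract mark years with no data, so those cells arrive as strings rather than numbers. A minimal sketch of one way to turn them into proper missing values, using two rows copied from the table above (the names `raw` and `clean` are only for illustration):

```python
import pandas as pd

# Two rows copied from the extract above; '..' marks a missing value.
raw = pd.DataFrame(
    {
        '2009 [YR2009]': [208600.8, 639843.8],
        '2010 [YR2010]': ['..', 403896.7],
        '2011 [YR2011]': ['..', 413798.7],
    },
    index=['United States', 'High income'],
)

# Coerce every column to numeric; anything unparseable ('..') becomes NaN.
clean = raw.apply(pd.to_numeric, errors='coerce')

# Count the missing values per column.
print(clean.isna().sum())
```

Passing `na_values='..'` to `pd.read_excel` should have a similar effect at load time, though I have not checked it against this particular spreadsheet.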

In [3]:
columns = ['%s [YR%s]' % (y, y) for y in range(2005, 2012)]
columns.append('Country Name')
data = data[columns].set_index('Country Name')
# Map each '2005 [YR2005]'-style label to the plain integer year.
data = data.rename(columns={'%s [YR%s]' % (y, y): y for y in range(2005, 2012)})
data = data.transpose()
data


Out[3]:
Country Name United States Upper middle income Lower middle income High income Low income
2005 205564.6 75308.2 22131.5 611257.8 738
2006 209272.3 86889.8 24360.2 627881.9 875.2
2007 209898 98451.9 26136.2 633008.4 978.9
2008 212883 109621.2 27411.3 645394.4 894.6
2009 208600.8 119248.5 28479.7 639843.8 973.3
2010 .. 122109.6 29832.2 403896.7 977.5
2011 .. 135289.7 31918.9 413798.7 1004.3

In [4]:
# Compensate for the missing US figures in the 2010 and 2011 High income stats
# by adding the last available US value (2009) to both years.

data.loc[2010, 'High income'] = data.loc[2010, 'High income'] + data.loc[2009, 'United States']
data.loc[2011, 'High income'] = data.loc[2011, 'High income'] + data.loc[2009, 'United States']
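The High income total drops from 639843.8 in 2009 to 403896.7 in 2010, the same year the US figures go missing. The cell above fills that gap with the 2009 US value, which amounts to assuming US output stayed flat through 2011. A small check of the arithmetic on a stand-in frame (the name `frame` is hypothetical; the values are copied from Out[3]):

```python
import pandas as pd

# Stand-in for the relevant corner of `data`; None marks the missing US years.
frame = pd.DataFrame(
    {
        'United States': [208600.8, None, None],
        'High income': [639843.8, 403896.7, 413798.7],
    },
    index=[2009, 2010, 2011],
)

# Reuse the last known US value as a proxy for the two missing years.
us_2009 = frame.loc[2009, 'United States']
for year in (2010, 2011):
    frame.loc[year, 'High income'] += us_2009

# Adjusted totals in thousands of articles, as plotted later.
print(frame['High income'] / 1000)
```

The adjusted 2010 and 2011 values are estimates, not reported figures, so any dip or recovery in those two years should be read with that in mind.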

In [5]:
data = data[['Low income','Lower middle income', 'Upper middle income', 'High income']]
data = data / 1000
data


Out[5]:
Country Name Low income Lower middle income Upper middle income High income
2005 0.738 22.1315 75.3082 611.2578
2006 0.8752 24.3602 86.8898 627.8819
2007 0.9789 26.1362 98.4519 633.0084
2008 0.8946 27.4113 109.6212 645.3944
2009 0.9733 28.4797 119.2485 639.8438
2010 0.9775 29.8322 122.1096 612.4975
2011 1.0043 31.9189 135.2897 622.3995
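Since the question is how things have changed, one rough way to put a number on it is the percentage change between the first and last year for each income group. This is a sketch on values copied from Out[5]; `thousands` and `pct_change` are illustrative names, and the 2011 High income figure includes the 2009 US proxy.

```python
import pandas as pd

# First and last rows of Out[5], in thousands of articles.
thousands = pd.DataFrame(
    {
        'Low income': [0.738, 1.0043],
        'Lower middle income': [22.1315, 31.9189],
        'Upper middle income': [75.3082, 135.2897],
        'High income': [611.2578, 622.3995],
    },
    index=[2005, 2011],
)

# Percentage change from 2005 to 2011 for each income group.
pct_change = (thousands.loc[2011] / thousands.loc[2005] - 1) * 100
print(pct_change.round(1))
```

Relative growth says nothing about absolute volume: the High income group still publishes several times more articles than the other three groups combined.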

In [6]:
line = Line(data, legend='top_right', 
            tools='pan,hover,wheel_zoom,resize,previewsave,reset', 
            palette=Spectral4, width=800, ylabel="'000s of articles published per year"
            )

lines = line.select({'type': GlyphRenderer})
hover = line.select({'type': HoverTool})
axes = line.select({'type': LinearAxis})

line.x_range = Range1d(2004, 2014)
for axis in axes:
    axis.axis_label_text_font_size = '10pt'
    axis.axis_label_standoff = 15

In [7]:
for l in lines:
    l.glyph.line_width = 5
    l.glyph.line_cap = 'round'

In [8]:
hover.tooltips = [('# scientific & engineering articles published', '$y')]
hover.point_policy = 'follow_mouse'

In [9]:
show(line)
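`bokeh.charts` was later split out of Bokeh into the separate `bkcharts` package, so the plotting cells above may not run on a current install. Below is a rough equivalent built on `bokeh.plotting`, assuming a Bokeh release recent enough to accept `legend_label`. The values are copied from Out[5], and the shorter tool list is my choice, not the original one.

```python
from bokeh.models import HoverTool, Range1d
from bokeh.palettes import Spectral4
from bokeh.plotting import figure

years = [2005, 2006, 2007, 2008, 2009, 2010, 2011]

# Thousands of articles per year, copied from Out[5].
series = {
    'Low income': [0.738, 0.8752, 0.9789, 0.8946, 0.9733, 0.9775, 1.0043],
    'Lower middle income': [22.1315, 24.3602, 26.1362, 27.4113, 28.4797, 29.8322, 31.9189],
    'Upper middle income': [75.3082, 86.8898, 98.4519, 109.6212, 119.2485, 122.1096, 135.2897],
    'High income': [611.2578, 627.8819, 633.0084, 645.3944, 639.8438, 612.4975, 622.3995],
}

p = figure(x_range=Range1d(2004, 2014), tools='pan,wheel_zoom,save,reset')
p.yaxis.axis_label = "'000s of articles published per year"

# One thick, round-capped line per income group, coloured from Spectral4.
renderers = []
for (name, values), color in zip(series.items(), Spectral4):
    renderers.append(
        p.line(years, values, legend_label=name, color=color,
               line_width=5, line_cap='round')
    )

# Hover tooltip showing the y value under the cursor, as in the original.
p.add_tools(HoverTool(
    tooltips=[('# scientific & engineering articles published', '$y')],
    point_policy='follow_mouse',
))
p.legend.location = 'top_right'
```

Calling `show(p)` after `output_notebook()` should then render the figure inline, as `show(line)` does above.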


Indicator details

Indicator Name: Scientific and technical journal articles (IP.JRN.ARTC.SC)

Long definition: Scientific and technical journal articles refer to the number of scientific and engineering articles published in the following fields: physics, biology, chemistry, mathematics, clinical medicine, biomedical research, engineering and technology, and earth and space sciences.

Source: National Science Foundation, Science and Engineering Indicators.

Topic: Infrastructure: Technology

Periodicity: Annual

Aggregation method: Sum

Statistical concept and methodology: The number of scientific and engineering articles published in the following fields: physics, biology, chemistry, mathematics, clinical medicine, biomedical research, engineering and technology, and earth and space sciences. The NSF considers article counts from a set of journals covered by Science Citation Index (SCI) and Social Sciences Citation Index (SSCI).

Development relevance: A scientific journal is a periodical publication intended to further the progress of science, usually by reporting new research. Most journals are highly specialized, although some of the oldest journals such as Nature publish articles and scientific papers across a wide range of scientific fields. Scientific journals contain articles that have been peer reviewed. When a scientific journal describes experiments or calculations, they must supply enough details that an independent researcher could repeat the experiment or calculation to verify the results. Each such journal article becomes part of the permanent scientific record. Some journals, such as Nature, Science, Proceedings of the National Academy of Sciences of the United States of America (PNAS), and Physical Review Letters, have a reputation of publishing articles that mark a fundamental breakthrough in their respective fields.

Limitations and exceptions: Scientific and technical article counts are from journals classified by the Institute for Scientific Information's Science Citation Index (SCI) and Social Sciences Citation Index (SSCI). Counts are based on fractional assignments; articles with authors from different countries are allocated proportionately to each country. The SCI and SSCI databases cover the core set of scientific journals but may exclude some of local importance and may reflect some bias toward English-language journals. Articles are classified by year of publication and assigned to region/country/economy on basis of institutional address(es) listed on the article. Articles are counted on a fractional-count basis that is, for articles with collaborating institutions from multiple countries/economies, each country/economy receives fractional credit on basis of proportion of its participating institutions. Details may not add to total because of rounding.

License Type: Open


In [ ]: